AITopics

Country: Europe (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Neural Information Processing SystemsMar-15-2024, 02:42:57 GMT

Message-Passing for Approximate MAP Inference with Latent Variables

We consider a general inference setting for discrete probabilistic graphical models where we seek maximum a posteriori (MAP) estimates for a subset of the random variables (max nodes), marginalizing over the rest (sum nodes). We present a hybrid message-passing algorithm to accomplish this. The hybrid algorithm passes a mix of sum and max messages depending on the type of source node (sum or max). We derive our algorithm by showing that it falls out as the solution of a particular relaxation of a variational framework. We further show that the Expectation Maximization algorithm can be seen as an approximation to our algorithm. Experimental results on synthetic and real-world datasets, against several baselines, demonstrate the efficacy of our proposed algorithm.

algorithm, assignment, node, (15 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Utah (0.04)
North America > United States > Maryland (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

Neural Information Processing SystemsMar-12-2024, 13:01:22 GMT

Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning Jean-Bastien Grill Michal Valko Rémi Munos SequeL team, INRIA Lille - Nord Europe, France Google DeepMind, UK

artificial intelligence, machine learning, node, (18 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Cazenave, Tristan, Legras, Swann, Ventos, Véronique

Optimizing $\alpha\mu$

arXiv.org Artificial IntelligenceJan-29-2021

$\alpha\mu$ is a search algorithm which repairs two defaults of Perfect Information Monte Carlo search: strategy fusion and non locality. In this paper we optimize $\alpha\mu$ for the game of Bridge, avoiding useless computations. The proposed optimizations are general and apply to other imperfect information turn-based games. We define multiple optimizations involving Pareto fronts, and show that these optimizations speed up the search. Some of these optimizations are cuts that stop the search at a node, while others keep track of which possible worlds have become redundant, avoiding unnecessary, costly evaluations. We also measure the benefits of parallelizing the double dummy searches at the leaves of the $\alpha\mu$ search tree.

algorithm, node, pareto front, (17 more...)

2101.12639

Country:

Europe > France (0.04)
North America > Canada (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Bridge (0.90)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Tatavarti, Hari Teja, Doshi, Prashant, Hayes, Layton

Recurrent Sum-Product-Max Networks for Decision Making in Perfectly-Observed Environments

arXiv.org Artificial IntelligenceJun-12-2020

Recent investigations into sum-product-max networks (SPMN) that generalize sum-product networks (SPN) offer a data-driven alternative for decision making, which has predominantly relied on handcrafted models. SPMNs computationally represent a probabilistic decision-making problem whose solution scales linearly in the size of the network. However, SPMNs are not well suited for sequential decision making over multiple time steps. In this paper, we present recurrent SPMNs (RSPMN) that learn from and model decision-making data over time. RSPMNs utilize a template network that is unfolded as needed depending on the length of the data sequence. This is significant as RSPMNs not only inherit the benefits of SPMNs in being data driven and mostly tractable, they are also well suited for sequential problems. We establish conditions on the template network, which guarantee that the resulting SPMN is valid, and present a structure learning algorithm to learn a sound template network. We demonstrate that the RSPMNs learned on a testbed of sequential decision-making data sets generate MEUs and policies that are close to the optimal on perfectly-observed domains. They easily improve on a recent batch-constrained reinforcement learning method, which is important because RSPMNs offer a new model-based approach to offline reinforcement learning.

machine learning, node, reinforcement learning, (19 more...)

2006.073

Country:

North America > United States > Georgia > Clarke County > Athens (0.14)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre:

Workflow (0.69)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Cazenave, Tristan, Ventos, Véronique

The {\alpha}{\mu} Search Algorithm for the Game of Bridge

arXiv.org Artificial IntelligenceNov-18-2019

{\alpha}{\mu} is an anytime heuristic search algorithm for incomplete information games that assumes perfect information for the opponents. {\alpha}{\mu} addresses the strategy fusion and non-locality problems encountered by Perfect Information Monte Carlo sampling. In this paper {\alpha}{\mu} is applied to the game of Bridge.

node, pareto front, vector, (16 more...)

1911.0796

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Bridge (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

#artificialintelligenceFeb-23-2019, 21:22:13 GMT

Minimax or Maximin? – Becoming Human: Artificial Intelligence Magazine

Minimax, as the name suggest, is a method in decision theory for minimizing the maximum loss. Alternatively, it can be thought of as maximizing the minimum gain, which is also know as Maximin. It all started from a two player zero-sum game theory, covering both the cases where players take alternate moves and those where they made simultaneous moves. It has also been extended to more complex games and to general decision making in the presence of uncertainty. In the above explanation, it has been mentioned that the minimax algorithms started off with the concept of zero-sum.

alpha and beta, max node, node, (13 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Kaufmann, Emilie, Koolen, Wouter

Monte-Carlo Tree Search by Best Arm Identification

arXiv.org Machine LearningNov-6-2017

We consider two-player zero-sum turn-based interactions, in which the sequence of possible successive moves is represented by a maximin game tree T. This tree models the possible actions sequences by a collection of MAX nodes, that correspond to states in the game in which player A should take action, MIN nodes, for states in the game in which player B should take action, and leaves which specify the payoff for player A. The goal is to determine the best action at the root for player A. For deterministic payoffs this search problem is primarily algorithmic, with several powerful pruning strategies available [20]. We look at problems with stochastic payoffs, which in addition present a major statistical challenge. Sequential identification questions in game trees with stochastic payoffs arise naturally as robust versions of bandit problems. They are also a core component of Monte Carlo tree search (MCTS) approaches for solving intractably large deterministic tree search problems, where an entire sub-tree is represented by a stochastic leaf in which randomized play-out and/or evaluations are performed [4]. A play-out consists in finishing the game with some simple, typically random, policy and observing the outcome for player A. For example, MCTS is used within the AlphaGo system [21], and the evaluation of a leaf position combines supervised learning and (smart) play-outs. While MCTS algorithms for Go have now reached expert human level, such algorithms remain very costly, in that many (expensive) leaf evaluations or play-outs are necessary to output the next action to be taken by the player. In this paper, we focus on the sample complexity of Monte-Carlo Tree Search methods, about which very little is known. For this purpose, we work under a simplified model for MCTS already studied by [22], and that generalizes the depth-two framework of [10].

algorithm, artificial intelligence, planning & scheduling, (17 more...)

arXiv.org Machine Learning

1706.02986

Country: Europe (0.46)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Go (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Grill, Jean-Bastien, Valko, Michal, Munos, Remi

Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning

Neural Information Processing SystemsDec-31-2016

We study the sampling-based planning problem in Markov decision processes (MDPs) that we can access only through a generative model, usually referred to as Monte-Carlo planning. Our objective is to return a good estimate of the optimal value function at any state while minimizing the number of calls to the generative model, i.e. the sample complexity. We propose a new algorithm, TrailBlazer, able to handle MDPs with a finite or an infinite number of transitions from state-action to next states. TrailBlazer is an adaptive algorithm that exploits possible structures of the MDP by exploring only a subset of states reachable by following near-optimal policies. We provide bounds on its sample complexity that depend on a measure of the quantity of near-optimal states. The algorithm behavior can be considered as an extension of Monte-Carlo sampling (for estimating an expectation) to problems that alternate maximization (over actions) and expectation (over next states). Finally, another appealing feature of TrailBlazer is that it is simple to implement and computationally efficient.

node, sample complexity, trailblazer, (13 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Aravamuthan, Sarang, Ganguly, Biswajit

e-Valuate: A Two-player Game on Arithmetic Expressions -- An Update

arXiv.org Artificial IntelligenceSep-12-2014

e-Valuate is a game on arithmetic expressions. The players have contrasting roles of maximizing and minimizing the given expression. The maximizer proposes values and the minimizer substitutes them for variables of his choice. When the expression is fully instantiated, its value is compared with a certain minimax value that would result if the players played to their optimal strategies. The winner is declared based on this comparison. We use a game tree to represent the state of the game and show how the minimax value can be computed efficiently using backward induction and alpha-beta pruning. The efficacy of alpha-beta pruning depends on the order in which the nodes are evaluated. Further improvements can be obtained by using transposition tables to prevent reevaluation of the same nodes. We propose a heuristic for node ordering. We show how the use of the heuristic and transposition tables lead to improved performance by comparing the number of nodes pruned by each method. We describe some domain-specific variants of this game. The first is a graph theoretic formulation wherein two players share a set of elements of a graph by coloring a related set with each player looking to maximize his share. The set being shared could be either the set of vertices, edges or faces (for a planar graph). An application of this is the sharing of regions enclosed by a planar graph where each player's aim is to maximize the area of his share. Another variant is a tiling game where the players alternately place dominoes on a $8 \times 8$ checkerboard to construct a maximal partial tiling. We show that the size of the tiling $x$ satisfies $22 \le x \le 32$ by proving that any maximal partial tiling requires at least $22$ dominoes.

artificial intelligence, minimax value, node, (15 more...)

1202.0862

Country: North America (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)